
    LICSS - a chemical spreadsheet in Microsoft Excel

    Background: Representations of chemical datasets in spreadsheet format are important for ready data assimilation and manipulation. In addition to the normal spreadsheet facilities, chemical spreadsheets need visualisable chemical structures and data that are searchable by chemical as well as textual queries. Many such chemical spreadsheet tools are available, some operating in the familiar Microsoft Excel environment. However, within this group, the performance of Excel is often compromised, particularly in terms of the number of compounds which can usefully be stored on a sheet.
    Summary: LICSS is a lightweight chemical spreadsheet within Microsoft Excel for Windows. LICSS stores structures solely as SMILES strings. Chemical operations are carried out by calling Java code modules which use the CDK, JChemPaint and OPSIN libraries to provide cheminformatics functionality. Compounds in sheets or charts may be visualised (individually or en masse), and sheets may be searched by substructure or similarity. All the molecular descriptors available in CDK may be calculated for compounds (in batch or on-the-fly), and various cheminformatic operations such as fingerprint calculation, Sammon mapping, clustering and R group table creation may be carried out. We detail here the features of LICSS and how they are implemented. We also explain the design criteria, particularly in terms of potential corporate use, which led to this particular implementation.
    Conclusions: LICSS is an Excel-based chemical spreadsheet with a difference:
    • It remains usable on sheets containing hundreds of thousands of compounds, and it does not compromise the normal performance of Microsoft Excel.
    • It is designed to be installed and run in environments in which users do not have admin privileges; installation involves merely file copying, and sharing of LICSS sheets invokes automatic installation.
    • It is free and extensible.
    LICSS is open source software, and we hope sufficient detail is provided here to enable developers to add their own features and share them with the community.

    A comparative evaluation of software for the analysis of liquid chromatography-tandem mass spectrometry data from isotope coded affinity tag experiments.

    The options available for processing quantitative data from isotope coded affinity tag (ICAT) experiments have mostly been confined to software specific to the instrument of acquisition. However, recent developments in data format conversion have increased such processing opportunities. In the present study, data sets from ICAT experiments, analysed with liquid chromatography/tandem mass spectrometry (MS/MS) using an Applied Biosystems QSTAR Pulsar quadrupole-TOF mass spectrometer, were processed in triplicate using separate mass spectrometry software packages. The programs Pro ICAT, Spectrum Mill and SEQUEST with XPRESS were employed. Attention was paid to the extent of common identification and agreement of quantitative results, with additional interest in the flexibility and productivity of these programs. The comparisons were made with data from the analysis of a specifically prepared test mixture of nine proteins at a range of relative concentration ratios from 0.1 to 10 (light to heavy labelled forms), as a known control, and with data selected from an ICAT study involving the measurement of cytokine induced protein expression in human lymphoblasts, as an applied example. Dissimilarities were detected in peptide identification that reflected how the associated scoring parameters favoured information from the MS/MS data sets. Accordingly, there were differences in the numbers of peptide and protein identifications, although from these it was apparent that both confirmatory and complementary information was present. In the quantitative results from the three programs, no statistically significant differences were observed.

    Highly sensitive feature detection for high resolution LC/MS

    Background: Liquid chromatography coupled to mass spectrometry (LC/MS) is an important analytical technology for, e.g., metabolomics experiments. Determining the boundaries, centres and intensities of the two-dimensional signals in the LC/MS raw data is called feature detection. Reliable feature detection is mandatory for the subsequent analysis of complex samples such as plant extracts, which may contain hundreds of compounds corresponding to thousands of features.
    Results: We developed a new feature detection algorithm, centWave, for high-resolution LC/MS data sets, which collects regions of interest (partial mass traces) in the raw data and applies continuous wavelet transformation and, optionally, Gauss-fitting in the chromatographic domain. We evaluated our feature detection algorithm on dilution series and mixtures of seed and leaf extracts, and estimated recall, precision and F-score of seed and leaf specific features in two experiments of different complexity.
    Conclusion: The new feature detection algorithm meets the requirements of current metabolomics experiments. centWave can detect close-by and partially overlapping features and has the highest overall recall and precision values compared to the other algorithms, matchedFilter (the original algorithm of XCMS) and the centroidPicker from MZmine. The centWave algorithm was integrated into the Bioconductor R-package XCMS and is available from http://www.bioconductor.org/
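The recall, precision and F-score mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how detected features were matched to expected ones, so the matching by plain identifier below is an assumption, and the function name and feature labels are hypothetical.

```python
def precision_recall_f(detected, expected):
    """Compute precision, recall and F-score for a set of detected features.

    Assumption: a detected feature counts as a true positive when its
    identifier also appears in the expected set. Real LC/MS evaluations
    would match on m/z and retention time within tolerances instead.
    """
    detected, expected = set(detected), set(expected)
    true_positives = len(detected & expected)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-score is the harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical feature identifiers, for illustration only.
detected_features = {"f1", "f2", "f3", "f4"}
expected_features = {"f1", "f2", "f5"}
precision, recall, f_score = precision_recall_f(detected_features, expected_features)
```

Here two of the four detected features are expected (precision 0.5) and two of the three expected features are found (recall 2/3).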

    Prospects for a Statistical Theory of LC/TOFMS Data

    The critical importance of employing sound statistical arguments when seeking to draw inferences from inexact measurements is well-established throughout the sciences. Yet fundamental statistical methods such as hypothesis testing can currently be applied to only a small subset of the data analytical problems encountered in LC/MS experiments. The means of inference that are more generally employed are based on a variety of heuristic techniques and a largely qualitative understanding of their behavior. In this article, we attempt to move towards a more formalized approach to the analysis of LC/TOFMS data by establishing some of the core concepts required for a detailed mathematical description of the data. Using arguments that are based on the fundamental workings of the instrument, we derive and validate a probability distribution that approximates that of the empirically obtained data and on the basis of which formal statistical tests can be constructed. Unlike many existing statistical models for MS data, the one presented here aims for rigor rather than generality. Consequently, the model is closely tailored to a particular type of TOF mass spectrometer, although the general approach carries over to other instrument designs. Looking ahead, we argue that further improvements in our ability to characterize the data mathematically could enable us to address a wide range of data analytical problems in a statistically rigorous manner.

    Parameter selection for peak alignment in chromatographic sample profiling: objective quality indicators and use of control samples

    In chromatographic profiling applications, peak alignment is often essential as most chromatographic systems exhibit small peak shifts over time. When using currently available alignment algorithms, there are several parameters that determine the outcome of the alignment process. Selecting the optimum set of parameters, however, is not straightforward, and the quality of an alignment result is at least partly determined by subjective decisions. Here, we demonstrate a new strategy to objectively determine the quality of an alignment result. This strategy makes use of a set of control samples that are analysed both spiked and non-spiked. With this set, not only can the system and the method be checked, but the quality of the peak alignment can also be evaluated. The developed strategy was tested on a representative metabolomics data set using three software packages, namely Markerlynx™, MZmine and MetAlign. The results indicate that the method was able to assess and define the quality of an alignment process without any subjective interference of the analyst, making the method a valuable contribution to the data handling process of chromatography-based metabolomics data.

    An iterative block-shifting approach to retention time alignment that preserves the shape and area of gas chromatography-mass spectrometry peaks

    Background: Metabolomics, petroleum and biodiesel chemistry, biomarker discovery, and other fields which rely on high-resolution profiling of complex chemical mixtures generate datasets which contain millions of detector intensity readings, each uniquely addressed along dimensions of time (e.g., retention time of chemicals on a chromatographic column), a spectral value (e.g., mass-to-charge ratio of ions derived from chemicals), and the analytical run number. They also must rely on data preprocessing techniques. In particular, inter-run variance in the retention time of chemical species poses a significant hurdle that must be cleared before feature extraction, data reduction, and knowledge discovery can ensue. Alignment methods for calibrating retention reportedly (and in our experience) can misalign matching chemicals, falsely align distinct ones, be unduly sensitive to chosen values of input parameters, and result in distortions of peak shape and area.
    Results: We present an iterative block-shifting approach for retention-time calibration that detects chromatographic features and qualifies them by retention time, spectrum, and the effect of their inclusion on the quality of alignment itself. Mass chromatograms are aligned pairwise to one selected as a reference. In tests using a 45-run GC-MS experiment, block-shifting reduced the absolute deviation of retention by more than 30-fold. It compared favourably to COW and XCMS with respect to alignment, and was markedly superior in preservation of peak area.
    Conclusion: Iterative block-shifting is an attractive method to align GC-MS mass chromatograms that is also generalizable to other two-dimensional techniques such as HPLC-MS.
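The idea that shifting, rather than warping, preserves peak shape and area can be illustrated with a heavily simplified sketch. This is not the iterative block-shifting method of the paper: it applies one rigid integer shift to a whole trace, chosen by maximising the overlap with a reference. All function names and the toy traces are hypothetical.

```python
def best_shift(reference, signal, max_shift):
    """Return the integer shift (in scan indices) that best overlays signal on reference.

    The score is the sum of products of overlapping intensities, i.e. an
    unnormalised cross-correlation evaluated at each candidate shift.
    """
    def score(shift):
        total = 0.0
        for i, ref_value in enumerate(reference):
            j = i - shift
            if 0 <= j < len(signal):
                total += ref_value * signal[j]
        return total

    return max(range(-max_shift, max_shift + 1), key=score)


def apply_shift(signal, shift):
    """Move every reading by `shift` positions, padding with zeros.

    Readings are relocated but never interpolated or rescaled, so peak
    shape and area are unchanged unless a peak is pushed off the edge.
    """
    shifted = [0.0] * len(signal)
    for j, value in enumerate(signal):
        i = j + shift
        if 0 <= i < len(signal):
            shifted[i] = value
    return shifted


# Toy traces: the same peak, eluting two scans later in the second run.
reference_trace = [0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 0.0, 0.0]
run_trace = [0.0, 0.0, 0.0, 0.0, 1.0, 5.0, 1.0, 0.0]

shift = best_shift(reference_trace, run_trace, max_shift=3)
aligned_trace = apply_shift(run_trace, shift)
```

A block-wise method would estimate such shifts separately for segments of the chromatogram and refine them over several passes; the sketch only shows why a rigid shift leaves the summed intensity of a peak untouched.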

    Envelope: interactive software for modeling and fitting complex isotope distributions

    Background: An important aspect of proteomic mass spectrometry involves quantifying and interpreting the isotope distributions arising from mixtures of macromolecules with different isotope labeling patterns. These patterns can be quite complex, in particular with in vivo metabolic labeling experiments producing fractional atomic labeling or fractional residue labeling of peptides or other macromolecules. In general, it can be difficult to distinguish the contributions of species with different labeling patterns to an experimental spectrum and difficult to calculate a theoretical isotope distribution to fit such data. There is a need for interactive and user-friendly software that can calculate and fit the entire isotope distribution of a complex mixture while comparing these calculations with experimental data and extracting the contributions from the differently labeled species.
    Results: Envelope has been developed to be user-friendly while still being as flexible and powerful as possible. Envelope can simultaneously calculate the isotope distributions for any number of different labeling patterns for a given peptide or oligonucleotide, while automatically summing these into a single overall isotope distribution. Envelope can handle fractional or complete atom or residue-based labeling, and the contribution from each different user-defined labeling pattern is clearly illustrated in the interactive display and is individually adjustable. At present, Envelope supports labeling with ²H, ¹³C, and ¹⁵N, and supports adjustments for baseline correction, an instrument accuracy offset in the m/z domain, and peak width. Furthermore, Envelope can display experimental data superimposed on calculated isotope distributions, and calculate a least-squares goodness of fit between the two. All of this information is displayed on the screen in a single graphical user interface. Envelope supports high-quality output of experimental and calculated distributions in PNG or PDF format. Beyond simply comparing calculated distributions to experimental data, Envelope is useful for planning or designing metabolic labeling experiments, by visualizing hypothetical isotope distributions in order to evaluate the feasibility of a labeling strategy. Envelope is also useful as a teaching tool, with its real-time display capabilities providing a straightforward way to illustrate the key variable factors that contribute to an observed isotope distribution.
    Conclusion: Envelope is a powerful tool for the interactive calculation and visualization of complex isotope distributions for comparison to experimental data. It is available under the GNU General Public License from http://williamson.scripps.edu/envelope/.
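To make the idea of summing the isotope distributions of differently labeled species concrete, here is a rough sketch. It is not Envelope's implementation: it models only carbon and nitrogen, works in nominal mass-shift units (extra neutrons) rather than exact m/z, and ignores peak width and baseline. The atom counts, the 50% ¹⁵N labeling fraction, the mixing weights and all function names are hypothetical; the natural abundances are approximate.

```python
from math import comb

# Approximate natural abundances of the heavy isotopes.
NATURAL_13C = 0.0107
NATURAL_15N = 0.00364


def binomial_envelope(n_atoms, p_heavy):
    """Probability of k heavy atoms out of n_atoms, for k = 0..n_atoms."""
    return [
        comb(n_atoms, k) * p_heavy**k * (1 - p_heavy) ** (n_atoms - k)
        for k in range(n_atoms + 1)
    ]


def convolve(a, b):
    """Combine two independent mass-shift distributions."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def mixture(envelopes, weights):
    """Weighted sum of envelopes, with weights normalised to total 1."""
    total_weight = sum(weights)
    out = [0.0] * max(len(e) for e in envelopes)
    for envelope, weight in zip(envelopes, weights):
        for k, value in enumerate(envelope):
            out[k] += (weight / total_weight) * value
    return out


# Hypothetical peptide with 50 carbon and 12 nitrogen atoms.
N_CARBON, N_NITROGEN = 50, 12
carbon = binomial_envelope(N_CARBON, NATURAL_13C)

# Species 1: natural abundance. Species 2: nitrogen 50% labeled with 15N.
unlabeled = convolve(carbon, binomial_envelope(N_NITROGEN, NATURAL_15N))
labeled = convolve(carbon, binomial_envelope(N_NITROGEN, 0.5))

# Overall envelope for an assumed 60:40 mixture of the two species.
combined = mixture([unlabeled, labeled], [0.6, 0.4])
```

Fitting such a model to data would then amount to adjusting the labeling fractions and mixing weights to minimise the squared difference between `combined` and the measured intensities.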

    BioSunMS: a plug-in-based software for the management of patients information and the analysis of peptide profiles from mass spectrometry

    Background: With the wide application of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) and surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS), statistical comparison of serum peptide profiles and management of patient information play an important role in clinical studies, such as early diagnosis, personalized medicine and biomarker discovery. However, currently available software tools mainly focus on data analysis rather than providing a flexible platform for both the management of patient information and mass spectrometry (MS) data analysis.
    Results: Here we present BioSunMS, a plug-in-based software package for both the management of patient information and statistical analysis based on serum peptide profiles. By integrating all functions into a user-friendly desktop application, BioSunMS provides a comprehensive solution for clinical researchers without any programming knowledge, as well as a plug-in architecture that allows developers to add or modify functions without needing to recompile the entire application.
    Conclusion: BioSunMS provides a plug-in-based solution for managing, analyzing, and sharing high volumes of MALDI-TOF or SELDI-TOF MS data. The software is freely distributed under the GNU General Public License (GPL) and can be downloaded from http://sourceforge.net/projects/biosunms/.